Genomic insights into differentiation and adaptation of Amorphophallus yunnanensis in the mountainous region of Southwest China

Abstract The role of geographical isolation and environmental adaptation in driving the differentiation and radiation of species has been a hotspot in evolutionary biology. The extremely complicated and fragmented geography of the mountainous region of Southwest China provides an excellent system for investigating the process of species divergence in heterogeneous habitats. Amorphophallus yunnanensis is a species of extreme habitat preference that resides mainly in the mountainous region of Southwest China. Here, we used restriction site‐associated DNA sequencing (RAD‐seq) to characterize the geographic pattern of genetic variation among 19 populations of A. yunnanensis as well as the genomic basis of environmental adaptation. A pattern of low population genetic diversity and high level of genetic differentiation was observed. The genomic data revealed a clear east–west genetic differentiation, with two distinct genetic lineages corresponding to the Guizhou plateau and Yunnan plateau, respectively. However, we discovered demographic expansion of the Guizhou Plateau lineage and recent hybridization in populations at the contact region. Significant levels of isolation by distance along with isolation by environment were detected. Outlier tests and genome–environment association analyses identified 89 putatively adaptive loci that might play a role in environmental adaptation. Our results suggest that the genetic divergence of A. yunnanensis is attributed to geographical isolation together with divergent selection in the mountainous region of Southwest China.


| INTRODUC TI ON
The genetic divergence of species in heterogeneously isolated habitats has long been a hotspot in evolution studies (Hu et al., 2022;Wang et al., 2019).For species in terrestrially isolated habitats with limited distribution abilities, such as plants, migration and gene flow among populations are restricted to narrow regions, and therefore geographic isolation patterns similar to islands are expected (Gao et al., 2015;Han et al., 2017).For example, a high level of genetic divergence as well as low gene flow among populations has been observed in plant species endemic to terrestrial island-like habitats (Vasquez et al., 2016).Besides, other studies propose that local adaption may contribute to the allopatric divergence of species in terrestrially isolated habitats (Guo et al., 2016).
The mountainous region of Southwest China is known for its complicated geography causing extreme spatial isolation of resident organisms (Wambulwa et al., 2022;Wang et al., 2013).Due to the diverse topography and ecology, this region has long been regarded as a global hotspot of biodiversity (López-Pujol et al., 2011).These characteristics make Southwest China an excellent system for studying processes that drive the geographical distribution of biodiversity (Liu et al., 2021).One important question is whether genetic differentiation is driven by complete geographical isolation or by environmental selection.Some researchers claim that fragmented habitats facilitated population isolation and allopatric diversification (Kai & Jiang, 2014).Hu et al. (2022) found that both the complex topography and local selection were major factors causing species divergence in the mountainous region of Southwest China.Nonetheless, more studies are needed to uncover mechanisms underlying the genetic differentiation patterns of the flora in Southwest China.
The genus Amorphophallus Bl. (Araceae) contains around 219 perennial species distributed across the tropical and subtropical regions of the Palaeotropics from West Africa to Pacific islands (Mayo et al., 1997).Southwest China harbors 17 Amorphophallus species, eight species of which are endemic (Li & Hetterscheid, 2010).A. yunnanensis is a species located in the mountain ranges of Southwest China and the northern regions of Laos, Thailand, and Vietnam.This species prefers shaded and humid locations with altitudes ranging from 200 to 3300 m, such as primary or secondary forests, thickets, and forest margins (Li & Hetterscheid, 2010).The plant of A. yunnanensis is characterized by a solitary leaf and an underground stem, and massive morphological variations have been observed across individuals.The petiole varies from medium to dark olive-green or dark olive-brown with rhombic or narrowly elliptic spots in whitish or greenish color (Li & Hetterscheid, 2010).In addition, the appendix of the flower also shows diversifications with several colors observed (Figure 1).Due to its high sensitivity to sunlight and temperature, A. yunnanensis is scattered throughout Southwest China in small, isolated populations, making this species an ideal model for inferring the effect of geographic isolation on genetic divergence.
Development in sequencing technologies has created an opportunity to uncover genomic variations from nonmodel organisms.
Recently, the first genome assembly in the Amorphophallus genus (A.konjac) has been reported, which provides valuable genomic resources for inferring the evolutionary and adaptation mechanisms of Amorphophallus (Gao et al., 2022).However, species in the Amorphophallus genus have large genomes , so reduced-representation library sequencing is a more effective technology for such taxa than whole-genome sequencing (Gargiulo et al., 2021;Tao et al., 2021).Restriction-site-associated DNA sequencing (RAD-seq) has been employed in population genetic studies of several Amorphophallus species, such as A. paeoniifolius and A. albus (Gao et al., 2017(Gao et al., , 2021)).While previous research focuses primarily on the genomic variation of cultivated resources of Amorphophallus, the evolutionary history of wild populations has not been investigated.
The previous study based on SSR markers described the genetic diversity and population structure of A. yunnanensis (Yin et al., 2021).
However, merely several SSRs lacked the power to infer fine-scale genetic patterns and functional genomic adaptation.Here, we used RAD-seq to explore the roles of diverse topography and ecology in causing the genetic differentiation of A. yunnanensis in Southwest China.In this study, we (1) draw the genetic variation pattern of this species, (2) infer the demographic history, and (3) investigate the effect of topographic heterogeneity and ecological variation in shaping population differentiation of A. yunnanensis.

| Sample collection
A population survey and sample collection of A. yunnanensis was conducted across southwest China from 2018 to 2020.The sampling nearly covers all potential distribution areas in Southwest China (Yunnan, Guizhou, Guangxi, and Hunan provinces), and leaf samples were randomly selected at intervals of at least 5 m for each population.All 125 individuals were sampled from 19 populations (Table 1 and Figure 2a).

| Restriction-site-associated DNA sequencing
RAD-seq was performed using established protocols (Baird et al., 2008).In brief, genomic DNA was digested with EcoRI endonuclease and then randomly fragmented during the library construction procedure.RAD libraries were paired-end (PE 150) sequenced by an Illumina NovaSeq 6000 platform.Prior to SNP calling, the genomic data were filtered to remove reads with unidentified nucleotides (Ns) or the proportion of low-quality bases (Phred value < 5) larger than 50%.We used STACKS v2.54 to assemble the sequencing data and genotype each individual (Rochette et al., 2019).During assembly, the maximum distance allowed between stacks (−M) was set to 3 and the parameter for creating putative alleles (−m) was set to 2.
For SNP genotyping, a series of filtering parameter combinations (minimum percentage of individuals in which loci occurred per population (r), minimum number of populations successfully genotyped (p), and minimum percentage of loci presenting across all individuals (R)) were tested to identify the one that maximized SNP numbers and minimized genotyping errors (Table S1).Finally, the parameter set "r75p16R85," which ensured an average of less than 10% missing genotypes per individual, was selected (Table S1).We only kept the first SNP of each locus to avoid linkage bias.In addition, another dataset with the incorporation of one population of A. paeoniifolius was also generated for analyzing gene flow between populations.

| Population genetic diversity and differentiation
Population genetic diversity statistics of SNP data were calculated with the "populations" program in STACKS (Rochette et al., 2019), including the percentage of polymorphic SNP loci (PPL), observed (H O ) and expected heterozygosity (H E ), nucleotide diversity (π), and inbreeding coefficient (F IS ).Populations with less than five individuals (FYND and DRHL) were excluded from all genetic diversity analyses to avoid the bias caused by small population sizes.Population differentiation (F ST ) between each pair of the 19 populations was calculated by the "populations" program, and the significance was tested by Arlequin v3.5 with 1000 permutations (Excoffier & Lischer, 2010;Rochette et al., 2019).

| Population structure
We used STRUCTURE v2.3.4 to infer the population differentiation pattern of A. yunnanensis (Pritchard et al., 2000).STRUCTURE calculations were conducted with 50,000 burn-ins and additional 100,000 Markov Chain Monte Carlo (MCMC) chains.The K values were set from 1 to 10, with 10 replicates for each K. Besides, we calculated the co-ancestry matrix among all individuals using fin-eRADstructure (Malinsky et al., 2018).The simulation of fineRADstructure was run by 100,000 burn-ins and another 100,000 MCMC chains that were sampled every 1000 steps.We also draw principal component analysis (PCA) plots of all individuals with the R package "adegenet" (Jombart, 2008).Finally, a maximum likelihood (ML) tree was constructed using IQ-TREE v1.6 with 1000 bootstraps (Nguyen et al., 2015).
AMOVA was then conducted to assess the genetic variance among different hierarchical levels (among groups, among populations within groups, and within populations) with 9999 permutations.

| Demographic history and gene flow
Stairway Plot v2 was utilized to infer the demographic history of two major lineages of A. yunnanensis based on genomic SNPs (Liu & Fu, 2015).The one-dimensional folded SFS dataset was generated using easySFS (https:// github.com/ isaac overc ast/ easySFS).The TA B L E 1 Geographic and genetic characteristics of Amorphophallus yunnanensis populations from Southwest China based on SNP data.at the time t a and t b , respectively.All parameters were set from 10 to 1000,000, and a total of 2,000,000 permutations were tested with 1000,000 permutations per scenario.
We used the software Treemix v1.13 to test for gene flow between populations of A. yunnanensis (Pickrell & Pritchard, 2012).For the analysis, we tested up to five migration events (m) among populations, and one population of A. paeoniifolius was used as an "outgroup."The optimal number of migration edges was determined by the secondorder rate of change across m (Δm) calculated by OptM (Fitak, 2021).

| Correlations between genetic, geographic, and environmental factors
We tested isolation by distance (IBD) and isolation by environment (IBE) of A. yunnanensis.Geographic distance between populations was calculated from geographic coordinates using GenAlex v6.5 (Peakall & Smouse, 2012).Nineteen climatic variables with a resolution of 2.5 arcminute for each population were downloaded from the WorldClim website (https:// www.world clim.org/ ; Table S2).To reduce dimensionality, principal components analysis (PCA) for climatic variables was conducted using SPSS Statistics v17.0.The first three PCs explained 92.79% of the total variation (Table S3).Thereby, the data of the first three PCs were used in the follow-up environmental correlation analysis.The pairwise environmental distances between each population were calculated using the "ecodist" package (Goslee & Urban, 2007).

| Genomic signatures of adaptation
BayPass v2.1 was used to scan the genome of A. yunnanensis for signatures of adaptive divergence (Gautier, 2015).First, we calculated the XtX statistics with all SNPs.Then, we simulated a pseudoobserved dataset (POD) of 10,000 SNPs to calibrate the XtX statistics.Lastly, we identified outliers based on a 1% threshold of the XtX statistics.In addition, we performed genome-environment association (GEA) analyses using BayPass v2.1, and SNP loci were considered as significantly associated with climatic variables by the 1% threshold of the XtX statistics.
To identify putative genes linked to outliers, we extracted candidate genes within 15 kb of the outlier loci from the annotated genome of Amorphophallus konjac.Briefly, we aligned the assembled sequence  of each outlier locus to the A. konjac genome by BLASTN v2.2.28+.
Potential functions of genes were annotated with the Gene Ontology (GO), KEGG, SwissProt, and nonredundant protein (Nr) databases.

| Genetic diversity and differentiation
A total of 702.89 Gb of sequencing data was generated from 125 samples of A. yunnanensis using RAD-seq.After filtering, 626.51 Gb of clean data was retained, and the amount of sequencing data per individual ranged from 3.01 to 9.02 Gb, with a mean of 5.01 Gb.The number of sequencing reads per individual ranged from 10,064,414 to 30,061,808.The assembly results showed that the average depth of each sample was 9.92× (Table S4).The resulting dataset consists of 19,034 high-quality SNP loci with an average missing rate per individual of 8.89% (Table S1

| Population structure
STRUCTURE analysis based on genomic SNPs detected two distinct genetic clusters (Figure S1).The two clusters were comprised of an east group distributed in the Guizhou plateau and adjacent areas, and a west group mainly in the Yunnan plateau (Figure 2a,b).The 19 sampled populations of A. yunnanensis were essentially clustered by geographic proximity, and introgressions were found in four populations located in the adjacent region.Besides, the STRUCTURE results of K = 3 and 4 also suggested the genetic uniqueness of populations in the contact region of two plateaus (Figure 2b).The heatmap of fineRADstructure suggested two main genetic groups with a pattern of east-west genetic differentiation.In addition, we observed a subdivision within the west group (Figure 3).The ML phylogenetic tree also identified two clades with a high support value (Figure 4).The PCA analysis pointed out that the first three PCs represented 15.70%, 7.41%, and 5.04% of the total variation, respectively.The PCA plots based on PC1 and PC2 as well as PC1 and PC3 all supported a genetic divergence between east and west groups (Figure 5).The distribution of genetic variance in A. yunnanensis was quantified using AMOVA.When assigning populations into two genetic groups, 49.26% of the variation was detected among groups, 19.51% among populations, and 31.23%within populations (all p-value < .001;Table 3).

| Demographic history
The stairway plot detected that the population size of the west group had been declining ever since 0.1 million years ago (Mya).In addition, the population size of the east group declined at around 0.5 Mya, and then a recent expansion occurred around 0.2 Mya (Figure 6).ABC analyses suggested that scenario 2 was the best-supported model (Figure 7).The divergence time of the two genetic groups (t d ) was 0.74 Mya, and both groups have undergone historical population declines.
Considering a life cycle of 5 years for A. yunnanensis, the effective population size of the west group declined at around 0.04 Mya (t a ), and the current population size (N 1 ) was 10 times smaller than that before the decline (N 1a ).Besides, the population size of the east group declined around 0.48 Mya with a reduction of 45% (Table 4).For the F I G U R E 6 Estimates of the effective population size of Amorphophallus yunnanensis through time based on genomic SNPs using stairway plot.
analysis of gene flow among populations, two migration events were demonstrated based on the Δm values calculated by OptM (Figure S2).
A high level of gene flow was suggested from the common ancestor of the east group to the population LPJLPB, and another weak signal was detected from A. paeoniifolius to the west group (Figure 8).

| Genomic signatures of adaptation
For 19 climatic variables, the first three PCs summarized 42.81%, 28.78%, and 21.77% of the variation, respectively (Table S3).All Mantel tests suggested significant IBD and IBE patterns of A. yunnanensis (p < .001).However, the correlation between genetic and environmental distances (r = .422)was lower than that between genetic and geographic distances (r = .654)(Figure 9).For the simulation of BayPass, Spearman's correlation revealed that the posterior estimate of Ω between POD and the real data was 0.998 (Figure S3).This that the simulated data faithfully mimic real datasets and the selected 1% threshold was robust to detect outlier loci.The genome scan based on the XtX calibration identified 62 outliers.For the GEA analyses, 46, 39, and 46 loci were found to be significantly associated with Clim_PC1, Clim_PC2, and Clim_PC3, respectively (Table S5 and Figure S4).| 11 of 15 GAO et al.
The potential gene functions of identified loci were annotated using the A. konjac genome.To ensure reliable annotations, we only considered loci with greater than 90% identity and alignment length >200 bp.Twenty-four out of the 89 loci had BLAST hits to genes, and most of the annotated loci were involved in the molecular function categories, such as nucleotide binding, protein kinase, protein binding, hydrolase, and catalytic activity (Table S6).

| Drivers shaping the genetic variation of A. yunnanensis
Finding forces that influence the genetic variation of species across the geographical landscape is the ultimate destination of genetic diversity research (Wambulwa et al., 2022).The fixation index expresses the degree of genetic differentiation, and a value of >0.25 is usually considered as a high genetic differentiation level (Wright, 1965).In this study, we detected an average pairwise F ST value of 0.341, and most of the F ST indices were detected as significant (p < .05)(Table 2).Nucleotide diversity for each population (π: 0.054-0.137)was lower than statistics calculated from wild populations of A. paeoniifolius (π: 0.273-0.285)(Gao et al., 2017).
These results suggested a pattern of generally low population genetic diversity and high genetic differentiation of A. yunnanensis.Similar patterns of genetic variation have also been observed in populations of other Amorphophallus species, such as A. konjac and A. albus (Gao et al., 2018(Gao et al., , 2021)).Amorphophallus are protogynous plant species, and the flowers send forth a strong odor to attract rove beetles as pollinators once pistils are mature (Tang et al., 2020).The limited dispersal ability of these insects might hinder the long-distance outbreeding of Amorphophallus species.
Besides, the complicated geomorphological features of the mountainous region of Southwest China have been shown to cause genetic differentiation in many plant species (e.g., Ju et al., 2018;Meng et al., 2015).Mantel test of IBD also suggested that geographic distance accounts for a significant amount of genomic variation in A. yunnanensis.Many studies on plants endemic to isolated habitats have suggested that geographical isolation as well as small population size leads to reductions in genetic variation due to genetic drift and limited gene flow (Han et al., 2017).Thus, geographical isolation and genetic drift may be a major driver of the genetic variation in species with fragmented habitats and low dispersal abilities, such as A. yunnanensis.

| Divergence and secondary contact between east and west groups
Allopatric divergence through geographical isolation is one common mechanism that generates biodiversity in mountain systems (Wiens & Graham, 2005).barriers to species dispersal (Ju et al., 2018;Xu et al., 2020).And mountain building has been described as a cause of the genetic differentiation within many plant species in Southwest China (Deng et al., 2020;Wambulwa et al., 2022).In this study, mountains of the Yunnan-Guizhou Plateau potentially created geographic isolation between populations in these two plateaus.Besides, both the Stairway plot and DIYABC discovered a recent decline in the west group (0.04-0.1 Mya), which coincides with the burst of the last interglacial (LIG).Considering the temperature fluctuations during the Quaternary, Quaternary glacial cycles potentially created niche differences between the Yunnan and Guizhou plateaus (Clift & Webb, 2019;Jia et al., 2020), driving the diversification of A. yunnanensis between these two regions.
In addition to two genetic clusters, we also detected genetic ad-

| Environmental adaptation
Evaluating the contribution of natural selection is important in studying adaptive evolution (Zhang et al., 2020).Our study revealed a significant pattern of isolation by environment (IBE) in A. yunnanensis, which suggested that environmental variables likely contributed to the genetic differentiation except the geographic isolation.Environmental fluctuations caused by mountain formation and isolation might lead to divergent selection (Ren et al., 2017).Outliers identified by GEA could have been targets of divergent selection (Forester et al., 2018;Zhang et al., 2020).BayPass identified highly differentiated SNPs that were

1
Morphological characteristics of Amorphophallus yunnanensis.(a) The plant of A. yunnanensis; (b) Variations in the petioles and flowers.

F
Geographical distribution and population genetic structure of Amorphophallus yunnanensis.(a) The sample locations of A. yunnanensis populations collected in Southwest China; (b) The STRUCTURE plot of 19 populations (K = 2-4).TA B L E 2 Genetic differentiation (pairwise FST ) values among 19 Amorphophallus yunnanensis populations based on genome-wide SNPs (below diagonal) and significance levels (above diagonal).

F
of Amorphophallus yunnanensis.Bootstrap values were shown at main nodes.

F
Principal components (PCs) of the variation among Amorphophallus yunnanensis individuals with PC1 versus PC2 (a), and PC1 versus PC3 (b).TA B L E 3 Results of the analyses of molecular variance (AMOVA) for genetic groups based on genomic SNPs.

F
Demographic models based on assumptions in Amorphophallus yunnanensis and corresponding posterior probabilities for scenario 2 in DIYABC analysis.(a) Illustrations of two alternative models.Scenario 1 stimulated that the two genetic groups diverged from the common ancestor at the time td, and the effect population sizes stayed stable.Scenario 2 proposed that historical population fluctuations have happened in two groups at the time ta and tb, respectively.(b) The posterior probability of two models estimated with logistic regression.(c) Model checking and PCA of observed data, comparing prior and posterior distributions of parameters in scenario 2.

F I G U R E 8
Maximum likelihood tree of Amorphophallus yunnanensis populations inferred by Treemix, with the gene flow events depicted with arrows.The branches with blue and orange colors indicate the east and west groups, respectively.
mixture in populations in the contact region.Meanwhile, the Stairway plot proposed a very recent population expansion of the east group (around 0.15 Mya), which suggested that range shifts of the clade caused by Quaternary climatic fluctuations might have led to the secondary contact (Hu et al., 2022; Liu et al., 2012).Treemix also detected gene flow from the east genetic group to populations in the contact region.Finally, a second hybridization caused by demographic expansion from the Guizhou Plateau could have occurred in the recent past, after the initial divergence of two clades, which has been reported in other species in the mountainous region of Southwest China (Ma et al., 2021).

F
Mantel tests of geographic, environmental, and genetic correlations based on SNP data.(a) Correlation of pairwise geographic distance versus pairwise genetic distance (F ST /(1−F ST )); (b) Correlation of pairwise environmental distance versus pairwise genetic distance.

Source of variation df Sum of squares Variance components Percentage of variation F-statistics p-Value
Posterior mean, median, range of 97.5% highest probability distribution (HPD) and mode for eight demographic parameters in scenario 2 for Amorphophallus yunnanensis (time is in generations).Note: N 1 and N 2 are the current effective population sizes of the west and east groups, respectively; N 1a and N 2a represent the population sizes of the two groups before population fluctuations; N s are the effective population size of the common ancestor of Amorphophallus yunnanensi.t a and t b represent the time of the population fluctuation for the west and east groups, respectively; and t d is the divergence time of the east and west groups.
Our previous study based on SSRs identified three genetic clusters in populations of A. yunnanensis.However, this study revealed a clear east-west genetic differentiation pattern, which separated populations distributed in the Guizhou plateau from those in the Yunnan plateau.AMOVA detected a significant amount of variation (49.26%, p < .0001) between the two genetic groups, which supported the phylogeographic pattern of A. yunnanensis.Several biogeographic boundaries have been proposed for southwest China, which acted as TA B L E 4